ILLUSTRATIVE EXAMPLES OF PRINCIPAL COMPONENT ANALYSIS

USING SYSTAT/FACTOR*

W. T. Federer, C. E. McCulloch and N. J. Miles-McDermott

BU-901-M  November 1986

ABSTRACT 

In order to provide a deeper understanding of the
workings  of  principal  components,  four  data  sets  were 
constructed  by  taking  linear  combinations  of  values  of 
two  uncorrelated  variables  to  form  the  X-variates  for 
the  principal  component  analysis.  The  examples 
highlight  some  of  the  properties  and  limitations  of 
principal  component  analysis. 

This  is  part  of  a  continuing  project  that  produces 
annotated  computer  output  for  principal  component 
analysis.  The  complete  project  will  involve  processing 
four examples on SAS/PRINCOMP, BMDP/4M, SPSS-X/FACTOR,
GENSTAT/PCP, and SYSTAT/FACTOR. We show here the
results from SYSTAT/FACTOR, Version 3.


*  Supported  by  the  U.S.  Army  Research  Office  through  the  Mathematical 
Sciences  Institute  of  Cornell  University. 


1.  INTRODUCTION 


Principal  components  is  a  form  of  multivariate  statistical 
analysis  and  is  one  method  of  studying  the  correlation  or 
covariance  structure  in  a  set  of  measurements  on  m  variables  for 
n  observations.  For  example,  a  data  set  may  consist  of  n  =  260 
samples  and  m  =  15  different  fatty  acid  variables.  It  may  be 
advantageous  to  study  the  structure  of  the  15  fatty  acid 
variables  since  some  or  all  of  the  variables  may  be  measuring  the 
same  response.  One  simple  method  of  studying  the  correlation 
structure is to compute the m(m-1)/2 pairwise correlations and
note  which  correlations  are  close  to  unity.  When  a  group  of 
variables  are  all  highly  inter-correlated,  one  may  be  selected 
for  use  and  the  others  discarded  or  the  sum  of  all  the  variables 
may  be  used.  When  the  structure  is  more  complex,  the  method  of 
principal  components  analysis  (PCA)  becomes  useful. 
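The pairwise screening described above can be sketched in a few lines. This is only an illustration: the three variables, their values, and the 0.95 cutoff are hypothetical and do not come from the fatty acid data mentioned in the text.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: m = 3 variables, n = 5 observations (not from the report).
data = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.1, 3.9, 6.2, 7.8, 10.1],   # roughly 2 * A
    "C": [5.0, 1.0, 4.0, 2.0, 3.0],
}

m = len(data)
pairs = list(combinations(data, 2))      # the m(m-1)/2 distinct pairs
corrs = {(u, v): pearson(data[u], data[v]) for u, v in pairs}

# Flag pairs whose correlation is close to unity (the cutoff is arbitrary).
near_unity = [p for p, r in corrs.items() if abs(r) > 0.95]
```

With these made-up values only the pair (A, B) is flagged, so one of the two could be dropped or the two could be summed, as the text suggests.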

In  order  to  use  and  interpret  a  principal  components  analysis, 
there  needs  to  be  some  practical  meaning  associated  with  the 
various  principal  components.  In  Section  2  we  describe  the  basic 
features  of  principal  components  and  in  Section  3  we  examine  some 
constructed  examples  using  SYSTAT/FACTOR  to  illustrate  the 
interpretations  that  are  possible.  In  Section  4  we  summarize  our 
results. 


2.  BASIC  FEATURES  OF  PRINCIPAL  COMPONENT  ANALYSIS 

PCA can be performed on either the variances and covariances
among the m variables or their correlations. One should always
check which is being used in a particular computer package
program. SYSTAT can use either the variances and covariances or
the correlations but uses the correlations by default. First we
will consider analyses using the matrix of variances and
covariances. A PCA generates m new variables, the principal
components (PCs), by forming linear combinations of the original
variables, $X = (X_1, X_2, \ldots, X_m)$, as follows:

\[
\begin{aligned}
PC_1 &= b_{11}X_1 + b_{12}X_2 + \cdots + b_{1m}X_m = X b_1 \\
PC_2 &= b_{21}X_1 + b_{22}X_2 + \cdots + b_{2m}X_m = X b_2 \\
     &\;\;\vdots \\
PC_m &= b_{m1}X_1 + b_{m2}X_2 + \cdots + b_{mm}X_m = X b_m
\end{aligned}
\]

In matrix notation,

\[
P = (PC_1, PC_2, \ldots, PC_m) = X (b_1, b_2, \ldots, b_m) = XB,
\]

and conversely $X = P B^{-1}$.

The rationale in the selection of the coefficients, $b_{ij}$, that
define the linear combinations that are the $PC_i$ is to try to
capture as much of the variation in the original variables with
as few PCs as possible. Since the variance of a linear
combination of the Xs can be made arbitrarily large by selecting
very large coefficients, the $b_{ij}$ are constrained by convention so
that the sum of squares of the coefficients for any PC is unity:

\[
\sum_{j=1}^{m} b_{ij}^2 = 1, \qquad i = 1, 2, \ldots, m .
\]

Under this constraint, the $b_{1j}$ in $PC_1$ are chosen so that $PC_1$ has
maximal variance.

If we denote the variance of $X_i$ by $s_i^2$ and if we define the
total variance, $\sum_{i=1}^{m} s_i^2$, as T, then the proportion of the
variance in the original variables that is captured in $PC_1$ can be
quantified as $\operatorname{var}(PC_1)/T$. In selecting the coefficients for $PC_2$,
they are further constrained by the requirement that $PC_2$ be
uncorrelated with $PC_1$. Subject to this constraint and the
constraint that the squared coefficients sum to one, the
coefficients $b_{2j}$ are selected so as to maximize $\operatorname{var}(PC_2)$.
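To make the "maximal variance under a unit sum of squares" idea concrete, the following sketch scans unit-length coefficient vectors for two variables and compares the best variance found with the largest eigenvalue of the variance-covariance matrix. The 2 x 2 matrix is hypothetical, and the brute-force scan is only an illustration, not the algorithm SYSTAT or any other package uses.

```python
import math

# Hypothetical 2 x 2 variance-covariance matrix (not from the report).
S = [[4.0, 2.0],
     [2.0, 3.0]]

def variance_of_combination(b, S):
    """Variance of b1*X1 + b2*X2, i.e. the quadratic form b' S b."""
    return sum(b[i] * S[i][j] * b[j] for i in range(2) for j in range(2))

# Scan unit-length coefficient vectors b = (cos t, sin t), so b1^2 + b2^2 = 1.
steps = 100_000
best_var, best_b = -1.0, None
for k in range(steps):
    t = math.pi * k / steps
    b = (math.cos(t), math.sin(t))
    v = variance_of_combination(b, S)
    if v > best_var:
        best_var, best_b = v, b

# Largest eigenvalue of a symmetric 2 x 2 matrix in closed form.
half_trace = (S[0][0] + S[1][1]) / 2
radius = math.hypot((S[0][0] - S[1][1]) / 2, S[0][1])
largest_eigenvalue = half_trace + radius
```

Up to the resolution of the scan, `best_var` agrees with `largest_eigenvalue`, and `best_b` gives the coefficients of the first PC for this made-up matrix.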

Further  coefficients  and  PCs  are  selected  in  a  similar  manner,  by 
requiring  that  a  PC  be  uncorrelated  with  all  PCs  previously 
selected  and  then  selecting  the  coefficients  to  maximize 
variance.  In  this  manner,  all  the  PCs  are  constructed  so  that 
they  are  uncorrelated  and  so  that  the  first  few  PCs  capture  as 
much variance as possible. The coefficients also have the
following interpretation which helps to relate the PCs back to
the original variables. The correlation between the ith PC and
the jth variable is

\[
b_{ij}\,\sqrt{\operatorname{var}(PC_i)}\,/\,s_j .
\]

After all m PCs have been constructed, the following identity
holds:

\[
\operatorname{var}(PC_1) + \operatorname{var}(PC_2) + \cdots + \operatorname{var}(PC_m) = T = \sum_{i=1}^{m} s_i^2 .
\]

This equation has the interpretation that the PCs divide up the
total variance of the Xs completely. It may happen that one or
more of the last few PCs have variance zero. In such a case, all
the variation in the data can be captured by fewer than m
variables. Actually, a much stronger result is also true; the
PCs can also be used to reproduce the actual values of the Xs,
not just their variance. We will demonstrate this more
explicitly later.

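The above properties of PCA are related to a matrix analysis
of the variance-covariance matrix of the Xs, $S_x$. Let D be a
diagonal matrix with entries being the eigenvalues, $\lambda_i$, of $S_x$
arranged in order from largest to smallest. Then the following
properties hold:

(i)  $\lambda_i = \operatorname{var}(PC_i)$

(ii)  $\operatorname{trace}(S_x) = \sum_{i=1}^{m} s_i^2 = T = \sum_{i=1}^{m} \lambda_i = \sum_{i=1}^{m} \operatorname{var}(PC_i)$

(iii)  $\operatorname{corr}(PC_i, X_j) = \dfrac{b_{ij}\sqrt{\lambda_i}}{s_j}$

(iv)  $S_x = B D B'$, where the columns of B are the coefficient vectors $b_i$.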
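Property (iii) follows in two lines once one assumes the standard fact that each coefficient vector $b_i$ is an eigenvector of $S_x$ with eigenvalue $\lambda_i$. The derivation below is a sketch under that assumption, writing $s_{kj}$ for the covariance between $X_k$ and $X_j$ (a symbol introduced here for illustration):

```latex
\begin{aligned}
\operatorname{cov}(PC_i, X_j)
  &= \operatorname{cov}\Big(\sum_{k=1}^{m} b_{ik} X_k,\; X_j\Big)
   = \sum_{k=1}^{m} b_{ik}\, s_{kj}
   = (S_x b_i)_j
   = \lambda_i\, b_{ij}, \\[4pt]
\operatorname{corr}(PC_i, X_j)
  &= \frac{\operatorname{cov}(PC_i, X_j)}{\sqrt{\operatorname{var}(PC_i)}\; s_j}
   = \frac{\lambda_i\, b_{ij}}{\sqrt{\lambda_i}\; s_j}
   = \frac{b_{ij}\sqrt{\lambda_i}}{s_j}.
\end{aligned}
```

The second line uses property (i), $\operatorname{var}(PC_i) = \lambda_i$, and applies only to PCs with nonzero variance.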


The  statements  made  above  are  for  the  case  when  the  analysis 
is  performed  on  the  variance-covariance  matrix  of  the  Xs.  The 
correlation  matrix  could  also  be  used,  which  is  equivalent  to 
performing  a  PCA  on  the  variance-covariance  matrix  of  the 
standardized variables,

\[
Y_i = \frac{X_i - \bar{X}_i}{s_i} .
\]

PCA using the correlation matrix is different in these respects:

(i)  The total "variance" is m, the number of variables.
(It is not truly variance anymore.)

(ii)  The correlation between $PC_i$ and $X_j$ is given by
$b_{ij}\sqrt{\operatorname{var}(PC_i)} = b_{ij}\sqrt{\lambda_i}$ (called component loading in
SYSTAT). Thus $PC_i$ is most highly correlated with the $X_j$
having the largest coefficient in $PC_i$ in absolute value.

The experimenter must choose whether to use standardized (PCA on
a correlation matrix) or unstandardized coefficients (PCA on a
variance-covariance matrix). The latter is used when the
variables are measured on a comparable basis. This usually means
that the variables must be in the same units and have roughly
comparable variances. If the variables are measured in different
units, then the analysis will usually be performed on the
standardized scale, otherwise the analysis may only reflect the
different scales of measurement. For example, if a number of
fatty acid analyses are made, but the variances, $s_i^2$, and means, $\bar{X}_i$,
are obtained on different bases and by different methods, then
standardized variables would be used (PCA on the correlation
matrix).
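The effect of the measurement scale can be sketched as follows. The two variables and their values are hypothetical; the point is only that the total variance changes with the units, while the standardized variables always have total "variance" m and their covariance is the correlation of the originals.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    """Sample covariance with divisor n - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def standardize(v):
    """Y = (X - mean) / standard deviation."""
    s = math.sqrt(cov(v, v))
    mv = mean(v)
    return [(a - mv) / s for a in v]

# Hypothetical measurements (not from the report).
length_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weight_kg = [52.0, 61.0, 65.0, 80.0, 83.0]
length_mm = [10 * a for a in length_cm]     # same variable, different unit

# The total variance depends on the units chosen ...
total_cm = cov(length_cm, length_cm) + cov(weight_kg, weight_kg)
total_mm = cov(length_mm, length_mm) + cov(weight_kg, weight_kg)

# ... but each standardized variable has variance 1, so the total is m = 2.
ys = [standardize(length_mm), standardize(weight_kg)]
total_standardized = sum(cov(y, y) for y in ys)

# The covariance of standardized variables does not depend on the unit used.
r_mm = cov(ys[0], ys[1])
r_cm = cov(standardize(length_cm), standardize(weight_kg))
```

Changing centimetres to millimetres multiplies one variance by 100 and would dominate an unstandardized analysis, which is the situation the paragraph above warns about.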

To  illustrate  some  of  the  above  ideas,  a  number  of  examples 
have  been  constructed  and  these  are  described  in  Section  3.  In 
each case two variables, $Z_1$ and $Z_2$, which are uncorrelated, are
used to construct the $X_i$. Thus, all the variance can be captured
with two variables and hence only two of the PCs will have
nonzero variances. In matrix analysis terms, only two eigenvalues
will be nonzero. An important thing to note is that in general,
PCA will not recover the original variables $Z_1$ and $Z_2$. Both
standardized and nonstandardized computations will be made.


3.  EXAMPLES


Throughout the examples we will use the variables $Z_1$ and $Z_2$
(with n = 11) from which we will construct $X_1, X_2, \ldots, X_m$. We will
perform PCA on the Xs. Thus, in our constructed examples, there
will only really be two underlying variables.

Values of Z_1 and Z_2

   Sample     Z_1     Z_2
      1        -5      15
      2        -4       6
      3        -3      -1
      4        -2      -6
      5        -1      -9
      6         0     -10
      7         1      -9
      8         2      -6
      9         3      -1
     10         4       6
     11         5      15

Notice that $Z_1$ exhibits a linear trend through the 11 samples and
$Z_2$ exhibits a quadratic trend. They are also chosen to have mean
zero and be uncorrelated. $Z_1$ and $Z_2$ have the following
variance-covariance matrix (a variance-covariance matrix has the variance
for the ith variable in the ith row and ith column and the
covariance between the ith variable and the jth variable in the ith
row and jth column).

Variance-covariance matrix of Z_1 and Z_2

   [ 11     0   ]
   [  0    85.8 ]

Thus the variance of $Z_1$ is 11 and the covariance between $Z_1$ and $Z_2$
is zero. Also the total variance is 11 + 85.8 = 96.8.
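The variance-covariance matrix of the two underlying variables can be checked directly. In the sketch below, the formula `z1**2 - 10` is an observation that reproduces the Z_2 values appearing in the Example 1 output; the report itself only says that Z_2 follows a quadratic trend.

```python
z1 = list(range(-5, 6))            # linear trend: -5, -4, ..., 5
z2 = [v * v - 10 for v in z1]      # quadratic trend: 15, 6, -1, ..., 15

def cov(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# 2 x 2 variance-covariance matrix of Z1 and Z2.
S = [[cov(z1, z1), cov(z1, z2)],
     [cov(z2, z1), cov(z2, z2)]]
total_variance = S[0][0] + S[1][1]
```

The diagonal entries come out as 11 and 85.8, the off-diagonal entries as zero, and the total as 96.8, matching the values quoted in the text.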

Example 1: In this first example we analyze $Z_1$ and $Z_2$ as if they
were the data. Thus $X_1 = Z_1$ and $X_2 = Z_2$ and m = 2. If PCA is
performed on the variance-covariance matrix, then the SYSTAT
output is as follows (SYSTAT control language for this example
and all subsequent examples is in the appendix and the bold face
print was typed on the computer output to explain the calculation
performed):

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.000      0.302
   X2        0.108      0.000

                 X1         X2     FACTOR(1)   FACTOR(2)
                                    = PC_1      = PC_2
   CASE  1     -5.000     15.000     15.000      -5.000
   CASE  2     -4.000      6.000      6.000      -4.000
   CASE  3     -3.000     -1.000     -1.000      -3.000
   CASE  4     -2.000     -6.000     -6.000      -2.000
   CASE  5     -1.000     -9.000     -9.000      -1.000
   CASE  6      0.000    -10.000    -10.000       0.000
   CASE  7      1.000     -9.000     -9.000       1.000
   CASE  8      2.000     -6.000     -6.000       2.000
   CASE  9      3.000     -1.000     -1.000       3.000
   CASE 10      4.000      6.000      6.000       4.000
   CASE 11      5.000     15.000     15.000       5.000

   PC_i = (y_i1 X_1 + y_i2 X_2) sqrt(lambda_i)
        = b_i1 X_1 + b_i2 X_2

   PC_1 = 0 X_1 + 1 X_2

We can interpret the results as follows:

1)  The first principal component is
    $PC_1 = 0 \cdot X_1 + 1 \cdot X_2 = X_2$

2)  $PC_2 = 1 \cdot X_1 + 0 \cdot X_2 = X_1$

3)  $\operatorname{Var}(PC_1)$ = eigenvalue = 85.8 = $\operatorname{Var}(X_2)$

4)  $\operatorname{Var}(PC_2)$ = eigenvalue = 11.0 = $\operatorname{Var}(X_1)$

The PCs will be the same as the Xs whenever the Xs are
uncorrelated. Since $X_2$ has the larger variance, it becomes the
first principal component.

If  PCA  is  performed  on  the  correlation  matrix,  we  get  slightly 
different  results. 

Correlation Matrix of Z_1 and Z_2

   [ 1   0 ]
   [ 0   1 ]

A correlation matrix always has unities along its diagonal and
the correlation between the ith variable and the jth variable in
the ith row and jth column. PCA in SYSTAT would yield the
following output:

MATRIX TO BE FACTORED = Correlation Matrix (r_ij)

                  X1                      X2
   X1    r_11 = 1.000
   X2    r_12 = r_21 = -0.000     r_22 = 1.000

LATENT ROOTS (EIGENVALUES) = lambda_i

               1                   2
      lambda_1 = 1.000    lambda_2 = 1.000       sum of lambda_i = m

COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

            1 = A_1    2 = A_2
   X1        1.000      0.000        b_1 = [1  0] / sqrt(1) = [1  0]
   X2        0.000      1.000

VARIANCE EXPLAINED BY COMPONENTS

               1          2
             1.000      1.000

PERCENT OF TOTAL VARIANCE EXPLAINED = proportion of variance explained by PC_i

               1          2
            50.000     50.000

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        1.000      0.000
   X2        0.000      1.000

                 X1         X2     FACTOR(1)   FACTOR(2)
                                    = PC_1      = PC_2
   CASE  1     -5.000     15.000     -1.508       1.619
   CASE  2     -4.000      6.000     -1.206       0.648
   CASE  3     -3.000     -1.000     -0.905      -0.108
   CASE  4     -2.000     -6.000     -0.603      -0.648
   CASE  5     -1.000     -9.000     -0.302      -0.972
   CASE  6      0.000    -10.000      0.000      -1.080
   CASE  7      1.000     -9.000      0.302      -0.972
   CASE  8      2.000     -6.000      0.603      -0.648
   CASE  9      3.000     -1.000      0.905      -0.108
   CASE 10      4.000      6.000      1.206       0.648
   CASE 11      5.000     15.000      1.508       1.619

   PC_i = y_i1 X_1/s_1 + y_i2 X_2/s_2
        = b_i1 X_1/s_1 + b_i2 X_2/s_2

   PC_1 = 1 X_1/3.32 + 0 X_2/9.26

   for case 1,
        = -5/3.32
        = -1.508


The  principal  components  are  again  the  Xs  (standardized  Zs)  themselves, 
but  the  eigenvalues  (var(PCs))  are  unity  since  the  variables  have  been 
standardized  first. 
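The standardized scores in the correlation analysis of Example 1 can be reproduced by dividing each variable by its standard deviation, since both means are zero. This is a minimal sketch; `z1**2 - 10` is used as a convenient way to regenerate the X2 column and is not a formula stated in the report.

```python
import math

z1 = list(range(-5, 6))              # X1 column of the output
z2 = [v * v - 10 for v in z1]        # reproduces the X2 column of the output

def std_dev(v):
    """Sample standard deviation with divisor n - 1."""
    mv = sum(v) / len(v)
    return math.sqrt(sum((a - mv) ** 2 for a in v) / (len(v) - 1))

s1, s2 = std_dev(z1), std_dev(z2)    # about 3.317 and 9.263

# Means are zero, so standardizing reduces to dividing by the standard deviation.
factor1 = [round(a / s1, 3) for a in z1]    # PC1 = X1 / s1
factor2 = [round(a / s2, 3) for a in z2]    # PC2 = X2 / s2
```

The first entries of `factor1` and `factor2` are -1.508 and 1.619, agreeing with the FACTOR(1) and FACTOR(2) columns for case 1.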


Example 2: Let $X_1 = Z_1$, $X_2 = 2Z_1$ and $X_3 = Z_2$. The summary statistics
are given below.

                 X1          X2          X3
   MEAN       0.000000    0.000000    0.000000
   ST DEV     3.316625    6.63325     9.262829

If the analysis is performed on the variance-covariance matrix using
SYSTAT the results are:

MATRIX TO BE FACTORED = Covariance Matrix (s_ij)

               X1         X2         X3
   X1        11.000                            Note: SYSTAT does not give covariances
   X2        22.000     44.000                 above the diagonal
   X3        -0.000     -0.000     85.800

LATENT ROOTS (EIGENVALUES) = lambda_i

               1          2          3
            85.800     55.000      0.000       Note: sum of s_i^2 = sum of lambda_i

COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

               1          2
   X1        0.000      3.317        b_2 = [3.317  6.633  0] / sqrt(55)
   X2        0.000      6.633            = [.447  .894  0]
   X3        9.263      0.000

Note: The 3rd component loadings were 0's and are not printed by SYSTAT.

VARIANCE EXPLAINED BY COMPONENTS

               1          2
            85.800     55.000

PERCENT OF TOTAL VARIANCE EXPLAINED

               1          2
            60.938     39.063

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.000      0.060
   X2        0.000      0.121
   X3        0.108      0.000

                 X1         X2         X3     FACTOR(1)   FACTOR(2)
                                               = PC_1      = PC_2
   CASE  1     -5.000    -10.000     15.000     15.000     -11.180
   CASE  2     -4.000     -8.000      6.000      6.000      -8.944
   CASE  3     -3.000     -6.000     -1.000     -1.000      -6.708
   CASE  4     -2.000     -4.000     -6.000     -6.000      -4.472
   CASE  5     -1.000     -2.000     -9.000     -9.000      -2.236
   CASE  6      0.000      0.000    -10.000    -10.000       0.000
   CASE  7      1.000      2.000     -9.000     -9.000       2.236
   CASE  8      2.000      4.000     -6.000     -6.000       4.472
   CASE  9      3.000      6.000     -1.000     -1.000       6.708
   CASE 10      4.000      8.000      6.000      6.000       8.944
   CASE 11      5.000     10.000     15.000     15.000      11.180

   PC_i = (y_i1 X_1 + y_i2 X_2 + y_i3 X_3) sqrt(lambda_i)
        = b_i1 X_1 + b_i2 X_2 + b_i3 X_3

   PC_2 = .447 X_1 + .894 X_2 + 0 X_3

   for case 1,
        = .447(-5) + .894(-10) + 0(15)
        = -11.18
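The eigenvalues and coefficients reported for the variance-covariance analysis of Example 2 can be checked against the defining relation S b = lambda b. The sketch below takes the covariance matrix from the output; the third vector `b3` (the zero-variance direction) is not printed by SYSTAT and is supplied here as an assumption for illustration.

```python
import math

# Variance-covariance matrix of X1, X2, X3 from the output (full symmetric form).
S = [[11.0, 22.0, 0.0],
     [22.0, 44.0, 0.0],
     [0.0,  0.0, 85.8]]

def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

b1 = [0.0, 0.0, 1.0]                               # PC1 = X3
b2 = [1 / math.sqrt(5), 2 / math.sqrt(5), 0.0]     # about [.447, .894, 0]
b3 = [2 / math.sqrt(5), -1 / math.sqrt(5), 0.0]    # assumed zero-variance direction

# Residuals S b - lambda b should vanish for each (lambda, b) pair.
pairs = [(85.8, b1), (55.0, b2), (0.0, b3)]
residuals = [max(abs(sb - lam * bj) for sb, bj in zip(matvec(S, b), b))
             for lam, b in pairs]

# Score of case 1 (X1 = -5, X2 = -10, X3 = 15) on PC2.
case1 = [-5.0, -10.0, 15.0]
pc2_case1 = sum(c * x for c, x in zip(b2, case1))

# Percent of total variance for each component.
percent = [100 * lam / (11 + 44 + 85.8) for lam, _ in pairs]
```

The residuals are zero to rounding error, `pc2_case1` is about -11.180, and `percent` reproduces 60.938 and 39.063 after rounding to three decimals.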


Analyzing  the  correlation  matrix  gives  the  following  results: 


MATRIX TO BE FACTORED = Correlation Matrix (r_ij)

               X1         X2         X3
   X1         1.000
   X2         1.000      1.000
   X3        -0.000     -0.000      1.000

LATENT ROOTS (EIGENVALUES) = lambda_i

               1          2          3
             2.000      1.000      0.000

COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

               1          2
   X1        1.000      0.000        b_1 = [1  1  0] / sqrt(2)
   X2        1.000      0.000            = [.707  .707  0]
   X3        0.000      1.000

VARIANCE EXPLAINED BY COMPONENTS

               1          2
             2.000      1.000

PERCENT OF TOTAL VARIANCE EXPLAINED

               1          2
            66.667     33.333

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.500      0.000
   X2        0.500      0.000
   X3        0.000      1.000

                 X1         X2         X3     FACTOR(1)   FACTOR(2)
                                               = PC_1      = PC_2
   CASE  1     -5.000    -10.000     15.000     -1.508       1.619
   CASE  2     -4.000     -8.000      6.000     -1.206       0.648
   CASE  3     -3.000     -6.000     -1.000     -0.905      -0.108
   CASE  4     -2.000     -4.000     -6.000     -0.603      -0.648
   CASE  5     -1.000     -2.000     -9.000     -0.302      -0.972
   CASE  6      0.000      0.000    -10.000      0.000      -1.080
   CASE  7      1.000      2.000     -9.000      0.302      -0.972
   CASE  8      2.000      4.000     -6.000      0.603      -0.648
   CASE  9      3.000      6.000     -1.000      0.905      -0.108
   CASE 10      4.000      8.000      6.000      1.206       0.648
   CASE 11      5.000     10.000     15.000      1.508       1.619

   PC_i = y_i1 X_1/s_1 + y_i2 X_2/s_2 + y_i3 X_3/s_3
        = (b_i1 X_1/s_1 + b_i2 X_2/s_2 + b_i3 X_3/s_3) / sqrt(lambda_i)

   PC_1 = (.707 X_1/3.317 + .707 X_2/6.633 + 0 X_3/9.263) / sqrt(2)

   for case 1,
        = (.707(-5)/3.317 + .707(-10)/6.633) / sqrt(2)
        = -1.508
There are several items to note in these analyses:

i)  There are only two nonzero eigenvalues since $X_2$ can be computed
from $X_1$.

ii)  $X_3$ is its own principal component since it is uncorrelated with
all the other variables.

iii)  The sum of the eigenvalues is the sum of the variances, i.e.,

11 + 44 + 85.8 = 140.8

and

1 + 1 + 1 = 3 .

iv)  For the variance-covariance analysis, the ratio of the
coefficients of $X_1$ and $X_2$ in $PC_2$ is the same as the ratio of
the variables themselves (since $X_2 = 2X_1$).

v)  Since there are only two nonzero eigenvalues, only two of
the PCs have nonzero variances (are nonconstant).

vi)  The coefficients help to relate the variables and the PCs. In
the variance-covariance analysis, for example,

$\operatorname{corr}(PC_1, X_3) = b_{13}\sqrt{\lambda_1}/s_3 = 1(9.263)/9.263 = 1$

and

$\operatorname{corr}(PC_2, X_1) = b_{21}\sqrt{\lambda_2}/s_1 = .447(7.416)/3.317 = 1$ .

Thus, in both these cases, the variable is perfectly
correlated with the PC.

vii)  The Xs can be reconstructed exactly from the PCs with
nonzero eigenvalues. For example, in the variance-covariance
analysis, $X_3$ is clearly given by $PC_1$. $X_1$ and
$X_2$ can be recovered via the formulas

$X_1 = PC_2/\sqrt{5}$

$X_2 = 2\,PC_2/\sqrt{5}$ .

As a numerical example,

$-5 = -11.180/\sqrt{5}$ .
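The reconstruction of the Xs from the two PCs with nonzero eigenvalues can be verified for all 11 cases. This sketch rebuilds the Example 2 variables from Z_1 and Z_2 (using `z1**2 - 10` as a convenient generator for Z_2, which is an observation rather than a formula from the report) and then inverts the relationship.

```python
import math

z1 = list(range(-5, 6))
z2 = [v * v - 10 for v in z1]
x1, x2, x3 = z1, [2 * v for v in z1], z2      # Example 2 variables

# Scores on the two PCs with nonzero variance (variance-covariance analysis).
pc1 = list(x3)                                               # PC1 = X3
pc2 = [(a + 2 * b) / math.sqrt(5) for a, b in zip(x1, x2)]   # PC2 = (X1 + 2 X2)/sqrt(5)

# Invert the relationship using only PC1 and PC2.
x1_back = [p / math.sqrt(5) for p in pc2]
x2_back = [2 * p / math.sqrt(5) for p in pc2]
x3_back = list(pc1)

# Largest absolute reconstruction error over all variables and cases.
max_error = max(abs(u - v)
                for original, rebuilt in ((x1, x1_back), (x2, x2_back), (x3, x3_back))
                for u, v in zip(original, rebuilt))
```

`max_error` is zero up to floating-point rounding, so the third PC (with zero variance) is not needed to recover the data.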

Example 3: For Example 3 we use $X_1 = Z_1$, $X_2 = 2(Z_1 + 5)$, $X_3 = 3(Z_1 + 5)$
and $X_4 = Z_2$. Thus $X_1$, $X_2$ and $X_3$ are all created from $Z_1$.
The data and summary statistics are:

   OBS        X1        X2        X3        X4
     1        -5         0         0        15
     2        -4         2         3         6
     3        -3         4         6        -1
     4        -2         6         9        -6
     5        -1         8        12        -9
     6         0        10        15       -10
     7         1        12        18        -9
     8         2        14        21        -6
     9         3        16        24        -1
    10         4        18        27         6
    11         5        20        30        15

                 X1          X2          X3          X4
   MEAN       0.000000    10.00000    15.00000    0.00000
   ST DEV     3.316625     6.63325     9.94987    9.26283


The  analyses  for  the  variance-covariance  matrix  (unstandardized 
analysis)  and  correlation  matrix  (standardized  analysis)  are 
given  below. 

MATRIX TO BE FACTORED = Covariance Matrix (s_ij)

               X1         X2         X4         X3
   X1        11.000
   X2        22.000     44.000
   X4        -0.000     -0.000     85.800
   X3        33.000     66.000     -0.000     99.000

Note the order that SYSTAT prints variable information. (Order is set by
SYSTAT based on order variables were created).

LATENT ROOTS (EIGENVALUES) = lambda_i

               1          2          3          4
           154.000     85.800      0.000     -0.000

COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

               1          2
   X1        3.317      0.000        b_1 = [3.317  6.633  0  9.950] / sqrt(154)
   X2        6.633      0.000            = [.267  .535  0  .802]
   X4        0.000     -9.263
   X3        9.950      0.000

Note: The 3rd and 4th component loadings were 0

VARIANCE EXPLAINED BY COMPONENTS

               1          2
           154.000     85.800

PERCENT OF TOTAL VARIANCE EXPLAINED

               1          2
            64.220     35.780

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.022      0.000
   X2        0.043      0.000
   X4        0.000     -0.108
   X3        0.065      0.000

              FACTOR(1)   FACTOR(2)
               = PC_1      = PC_2
   CASE  1    -18.708     -15.000
   CASE  2    -14.967      -6.000
   CASE  3    -11.225       1.000
   CASE  4     -7.483       6.000
   CASE  5     -3.742       9.000
   CASE  6     -0.000      10.000
   CASE  7      3.742       9.000
   CASE  8      7.483       6.000
   CASE  9     11.225       1.000
   CASE 10     14.967      -6.000
   CASE 11     18.708     -15.000

   PC_i = b_i1 (X_1 - Xbar_1) + b_i2 (X_2 - Xbar_2) + b_i3 (X_3 - Xbar_3) + b_i4 (X_4 - Xbar_4)

   PC_1 = 0.267 (X_1 - 0) + 0.535 (X_2 - 10) + 0.802 (X_3 - 15) + 0 (X_4 - 0)

   for case 1,
        = 0.267(-5) + 0.535(0 - 10) + 0.802(0 - 15)
        = -18.71

MATRIX TO BE FACTORED = Correlation Matrix (r_ij)

               X1         X2         X4         X3
   X1         1.000
   X2         1.000      1.000
   X4        -0.000     -0.000      1.000
   X3         1.000      1.000     -0.000      1.000

LATENT ROOTS (EIGENVALUES) = lambda_i

               1          2          3          4
             3.000      1.000      0.000     -0.000

COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

               1          2
   X1        1.000      0.000        b_1 = [1  1  0  1] / sqrt(3)
   X2        1.000      0.000            = [.577  .577  0  .577]
   X4       -0.000     -1.000
   X3        1.000      0.000

VARIANCE EXPLAINED BY COMPONENTS

               1          2
             3.000      1.000

PERCENT OF TOTAL VARIANCE EXPLAINED

               1          2
            75.000     25.000

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.333     -0.000
   X2        0.333     -0.000
   X4       -0.000     -1.000
   X3        0.333     -0.000

              FACTOR(1)   FACTOR(2)
               = PC_1      = PC_2
   CASE  1     -1.508      -1.619
   CASE  2     -1.206      -0.648
   CASE  3     -0.905       0.108
   CASE  4     -0.603       0.648
   CASE  5     -0.302       0.972
   CASE  6      0.000       1.080
   CASE  7      0.302       0.972
   CASE  8      0.603       0.648
   CASE  9      0.905       0.108
   CASE 10      1.206      -0.648
   CASE 11      1.508      -1.619

   PC_i = y_i1 (X_1 - Xbar_1)/s_1 + y_i2 (X_2 - Xbar_2)/s_2 + y_i3 (X_3 - Xbar_3)/s_3 + y_i4 (X_4 - Xbar_4)/s_4

   PC_1 = .333 (X_1 - 0)/3.317 + .333 (X_2 - 10)/6.633 + .333 (X_3 - 15)/9.950 + 0

   for case 1,
        = .333(-5)/3.317 + .333(-10)/6.633 + .333(-15)/9.950


For the variance-covariance analysis, the coefficients in $PC_1$ are
in the same ratio as their relationship to $Z_1$. In the
correlation analysis $X_1$, $X_2$ and $X_3$ have equal coefficients. In
both analyses, as expected, the total variance is equal to the
sum of the variances for the PCs. In both cases two PCs, $PC_3$ and
$PC_4$, have zero variance and are identically zero.
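Example 3 differs from Example 2 in that two of the variables have nonzero means, so the scores are computed from centered values. The sketch below rebuilds the four variables (again using `z1**2 - 10` as an assumed generator for Z_2), centers them, and checks the first PC against the output.

```python
import math

z1 = list(range(-5, 6))
z2 = [v * v - 10 for v in z1]

# Columns in SYSTAT's printed order: X1, X2, X4, X3.
cols = [z1, [2 * (v + 5) for v in z1], z2, [3 * (v + 5) for v in z1]]
means = [sum(c) / len(c) for c in cols]                  # 0, 10, 0, 15

# Coefficients of PC1: proportional to (1, 2, 0, 3), scaled to unit length.
b1 = [c / math.sqrt(14) for c in (1, 2, 0, 3)]           # about [.267, .535, 0, .802]

# Centered score of every case on PC1.
pc1 = [sum(b * (col[k] - mu) for b, col, mu in zip(b1, cols, means))
       for k in range(len(z1))]

# The scores have mean zero, so the variance is the sum of squares over n - 1.
var_pc1 = sum(p * p for p in pc1) / (len(pc1) - 1)
```

The first score is about -18.708 and `var_pc1` is 154, the first latent root. The coefficients (1, 2, 3) mirror the multiples of Z_1 used to build X_1, X_2 and X_3.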

Example 4. In this example we take more complicated combinations
of $Z_1$ and $Z_2$:

\[
\begin{aligned}
X_1 &= Z_1 \\
X_2 &= 2Z_1 \\
X_3 &= 3Z_1 \\
X_4 &= Z_1/2 + Z_2 \\
X_5 &= Z_1/4 + Z_2 \\
X_6 &= Z_1/8 + Z_2 \\
X_7 &= Z_2
\end{aligned}
\]

Note that $X_1$, $X_2$ and $X_3$ are colinear (they all have correlation
unity) and $X_4$, $X_5$, $X_6$ and $X_7$ have steadily decreasing
correlations with $X_1$. The data and data summaries are below.

                X1        X2        X3        X4        X5        X6        X7
             -5.000   -10.000   -15.000    12.500    13.750    14.375    15.000
             -4.000    -8.000   -12.000     4.000     5.000     5.500     6.000
             -3.000    -6.000    -9.000    -2.500    -1.750    -1.375    -1.000
             -2.000    -4.000    -6.000    -7.000    -6.500    -6.250    -6.000
             -1.000    -2.000    -3.000    -9.500    -9.250    -9.125    -9.000
              0.000     0.000     0.000   -10.000   -10.000   -10.000   -10.000
              1.000     2.000     3.000    -8.500    -8.750    -8.875    -9.000
              2.000     4.000     6.000    -5.000    -5.500    -5.750    -6.000
              3.000     6.000     9.000     0.500    -0.250    -0.625    -1.000
              4.000     8.000    12.000     8.000     7.000     6.500     6.000
              5.000    10.000    15.000    17.500    16.250    15.625    15.000

                X1        X2        X3        X4        X5        X6        X7
   Mean       0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.000
   St Dev     3.31662   6.63325   9.94987   9.41010   9.29987   9.27210   9.262

The  PCAs  for  the  variance-covariance  and  correlation  matrices  are 
given  below. 


MATRIX TO BE FACTORED = Covariance Matrix

               X1         X2         X4         X5         X6
   X1        11.000
   X2        22.000     44.000
   X4         5.500     11.000     88.550
   X5         2.750      5.500     87.175     86.488
   X6         1.375      2.750     86.488     86.144     85.972
   X7        -0.000     -0.000     85.800     85.800     85.800
   X3        33.000     66.000     16.500      8.250      4.125

               X7         X3
   X7        85.800
   X3        -0.000     99.000

LATENT ROOTS (EIGENVALUES) = lambda_i

               1          2          3          4          5
           347.015    153.794      0.000      0.000     -0.000

               6          7
             0.000     -0.000


COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

               1          2
   X1        0.466      3.284        b_1 = [.466  .932  9.404  9.287  9.229  9.171  1.398] / sqrt(347.015)
   X2        0.932      6.567            = [.025  .050  .505  .499  .495  .492  .075]
   X4        9.404      0.340
   X5        9.287     -0.481
   X6        9.229     -0.891
   X7        9.171     -1.302
   X3        1.398      9.851

VARIANCE EXPLAINED BY COMPONENTS

               1          2
           347.015    153.794

PERCENT OF TOTAL VARIANCE EXPLAINED

               1          2
            69.291     30.709

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.001      0.021
   X2        0.003      0.043
   X4        0.027      0.002
   X5        0.027     -0.003
   X6        0.027     -0.006
   X7        0.026     -0.008
   X3        0.004      0.064

              FACTOR(1)   FACTOR(2)
   CASE  1     25.921     -21.332
   CASE  2      8.790     -15.937
   CASE  3     -4.359     -10.918
   CASE  4    -13.525      -6.275
   CASE  5    -18.709      -2.009
   CASE  6    -19.911       1.881
   CASE  7    -17.131       5.395
   CASE  8    -10.368       8.533
   CASE  9      0.377      11.294
   CASE 10     15.104      13.679
   CASE 11     33.813      15.688

   PC_i = b_i1 X_1 + b_i2 X_2 + b_i3 X_3 + b_i4 X_4 + b_i5 X_5 + b_i6 X_6 + b_i7 X_7

   PC_1 = .025 X_1 + .050 X_2 + .075 X_3 + .505 X_4 + .499 X_5 + .495 X_6 + .492 X_7

   for case 1,
   25.921 = .025(-5) + .050(-10) + .075(-15) + .505(12.5) + .499(13.75)
            + .495(14.375) + .492(15)


MATRIX TO BE FACTORED = Correlation Matrix

               X1         X2         X4         X5         X6
   X1         1.000
   X2         1.000      1.000
   X4         0.176      0.176      1.000
   X5         0.089      0.089      0.996      1.000
   X6         0.045      0.045      0.991      0.999      1.000
   X7        -0.000     -0.000      0.984      0.996      0.999
   X3         1.000      1.000      0.176      0.089      0.045

               X7         X3
   X7         1.000
   X3        -0.000      1.000

LATENT ROOTS (EIGENVALUES) = lambda_i

               1          2          3          4          5
             4.052      2.948      0.000      0.000      0.000

               6          7
            -0.000     -0.000

COMPONENT LOADINGS = b_i sqrt(lambda_i) = A_i

               1          2
   X1        0.290     -0.957
   X2        0.290     -0.957
   X4        0.993      0.117
   X5        0.979      0.204
   X6        0.969      0.247
   X7        0.957      0.290
   X3        0.290     -0.957

VARIANCE EXPLAINED BY COMPONENTS

               1          2
             4.052      2.948

PERCENT OF TOTAL VARIANCE EXPLAINED

               1          2
            57.888     42.112

FACTOR SCORE COEFFICIENTS = b_i / sqrt(lambda_i) = y_i

            1 = y_1    2 = y_2
   X1        0.072     -0.325
   X2        0.072     -0.325
   X4        0.245      0.040
   X5        0.242      0.069
   X6        0.239      0.084
   X7        0.236      0.099
   X3        0.072     -0.325

              FACTOR(1)   FACTOR(2)
   CASE  1      1.112       1.913
   CASE  2      0.270       1.342
   CASE  3     -0.366       0.834
   CASE  4     -0.795       0.389
   CASE  5     -1.017       0.006
   CASE  6     -1.033      -0.314
   CASE  7     -0.842      -0.571
   CASE  8     -0.445      -0.765
   CASE  9      0.159      -0.897
   CASE 10      0.970      -0.966
   CASE 11      1.987      -0.972


We  note  several  things: 

i)  In  both  analyses  there  are  only  two  eigenvalues  that  are  nonzero 
indicating  that  only  two  variables  are  needed.  This  is  not 
readily  apparent  from  the  correlation  or  variance-covariance 
matrix. 

ii)  In $PC_1$ and $PC_2$ of the correlation analysis, where the
standardized $X_1$, $X_2$ and $X_3$ are the same, they have the same
coefficients.

iii)  Neither PCA recovers $Z_1$ and $Z_2$. The PCs with nonzero variances
have elements of both $Z_1$ and $Z_2$ in them, i.e., neither $PC_1$ nor
$PC_2$ is perfectly correlated with one of the Zs.
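The two nonzero latent roots of the variance-covariance analysis in Example 4 can be obtained without forming the 7 x 7 matrix. The sketch below relies on a standard linear algebra fact that the report does not use: if X = ZA with uncorrelated Z_1 and Z_2, the nonzero eigenvalues of the covariance matrix of X equal those of the 2 x 2 matrix D^(1/2) (A A') D^(1/2), where D holds the variances of Z_1 and Z_2. The names `a1`, `a2` and `dot` are introduced here for illustration.

```python
import math

# Coefficients of Z1 and Z2 in X1, ..., X7 (Example 4 definitions).
a1 = [1, 2, 3, 1 / 2, 1 / 4, 1 / 8, 0]   # multiples of Z1
a2 = [0, 0, 0, 1, 1, 1, 1]               # multiples of Z2
var_z1, var_z2 = 11.0, 85.8              # Z1 and Z2 are uncorrelated

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# Entries of the symmetric 2 x 2 matrix M = D^(1/2) (A A') D^(1/2).
m11 = var_z1 * dot(a1, a1)
m22 = var_z2 * dot(a2, a2)
m12 = math.sqrt(var_z1 * var_z2) * dot(a1, a2)

# Eigenvalues of a symmetric 2 x 2 matrix in closed form.
half_trace = (m11 + m22) / 2
radius = math.hypot((m11 - m22) / 2, m12)
lambda1, lambda2 = half_trace + radius, half_trace - radius

total_variance = m11 + m22               # equals the sum of the seven variances
percent1 = 100 * lambda1 / total_variance
```

The results agree with the printed latent roots 347.015 and 153.794 and with 69.291 percent for the first component. The off-diagonal entry `m12` is nonzero, which is why each PC mixes Z_1 and Z_2 instead of recovering them separately.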

4.  SUMMARY 

PCA  provides  a  method  of  extracting  structure  from  the 
variance-covariance  or  correlation  matrix.  If  a  multivariate 
data  set  is  actually  constructed  in  a  linear  fashion  from  fewer 
variables,  then  PCA  will  discover  that  structure.  PCA  constructs 
linear  combinations  of  the  original  data,  X,  with  maximal 
variance: 

P  =  XB  . 

This  relationship  can  be  inverted  to  recover  the  Xs  from  the  PCs 
(actually  only  those  PCs  with  nonzero  eigenvalues  are  needed  - 
see  example  2).  Though  PCA  will  often  help  discover  structure  in 
a data set, it does have limitations. It will not necessarily
recover  the  exact  underlying  variables,  even  if  they  were 
uncorrelated  (Example  4) .  Also,  by  its  construction,  PCA  is 
limited  to  searching  for  linear  structures  in  the  Xs. 
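The restriction to linear structure can be illustrated with the two underlying variables themselves. In the sketch below, Z_2 is generated as `z1**2 - 10`, an observation that reproduces the Z_2 values used in the examples (the report describes Z_2 only as a quadratic trend). Although Z_2 is then an exact function of Z_1, their correlation is zero, so a PCA of the pair sees two unrelated variables.

```python
import math

z1 = list(range(-5, 6))
z2 = [v * v - 10 for v in z1]    # an exact, but nonlinear, function of z1

def corr(x, y):
    """Sample correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = corr(z1, z2)

# Eigenvalues of the 2 x 2 correlation matrix [[1, r], [r, 1]] are 1 + |r| and 1 - |r|.
eigenvalues = (1 + abs(r), 1 - abs(r))
```

Both eigenvalues equal one, as in the correlation analysis of Example 1, so neither component can be dropped even though one variable determines the other.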


APPENDIX 


Control  Language 


Control  language  is  typed  in  upper  case  and  comments  are  in  lower 
case.  Refer  to  SYSTAT,  Version  3,  1986,  for  program  documentation. 


FACTOR           ->  typed from DOS

USE PCA1         ->  instructs SYSTAT to perform the analysis on the
                     previously saved data file PCA1.SYS

SAVE PCACOR1     ->  instructs SYSTAT to save the PC scores in order
                     that they may be printed later with the DATA
                     module

NUMBER = 2       ->  indicates the number of components to print

FACTOR           ->  instructs SYSTAT to perform the PCA on all
                     variables in PCA1

SYSTAT will compute the PCA on the correlation matrix unless
otherwise directed. To request PCA on a variance-covariance
matrix add the following command somewhere before the FACTOR
command:

TYPE = COVARIANCE


